FIGURE 1

Plasmid sequence of pNC5LSPCEAp53 (pMC30B5) for vCP2086 
1 GCCCTTT CGTCTCG CGCGTTT CGGTGAT GACGGTG AAAACCT CTGACAC ATGCAGC TCCCGGA GACGGTC
CGGGAAA GCAGAGC GCGCAAA GCCACTA CTGCCAC TTTTGGA GACTGTG TACGTCG AGGGCCT CTGCCAG

71 ACAGCTT GTCTGTA AGCGGAT GCCGGGA GCAGACA AGCCCGT CAGGGCG CGTCAGC GGGTGTT GGCGGGT
TGTCGAA CAGACAT TCGCCTA CGGCCCT CGTCTGT TCGGGCA GTCCCGC GCAGTCG CCCACAA CCGCCCA

141 GTCGGGG CTGGCTT AACTATG CGGCATC AGAGCAG ATTGTAC TGAGAGT GCACCAT ATGCGGT GTGAAAT
CAGCCCC GACCGAA TTGATAC GCCGTAG TCTCGTC TAACATG ACTCTCA CGTGGTA TACGCCA CACTTTA

211 ACCGCAC AGATGCG TAAGGAG AAAATAC CGCATCA GGCGCCA TTCGCCA TTCAGGC TGCGCAA CTGTTGG
TGGCGTG TCTACGC ATTCCTC TTTTATG GCGTAGT CCGCGGT AAGCGGT AAGTCCG ACGCGTT GACAACC

281 GAAGGGC GATCGGT GCGGGCC TCTTCGC TATTACG CCAGCTG GCGAAAG GGGGATG TGCTGCA AGGCGAT
CTTCCCG CTAGCCA CGCCCGG AGAAGCG ATAATGC GGTCGAC CGCTTTC CCCCTAC ACGACGT TCCGCTA

351 TAAGTTG GGTAACG CCAGGGT TTTCCCA GTCACGA CGTTGTA AAACGAC GGCCAGT GCCAAGC TTGGCTG
ATTCAAC CCATTGC GGTCCCA AAAGGGT CAGTGCT GCAACAT TTTGCTG CCGGTCA CGGTTCG AACCGAC

Left Arm
421 CAGGTAT TCTAAAC TAGGAAT AGATGAA ATTATGT GCAAAGG AGATACC TTTAGAT ATGGATC TGATTTA
GTCCATA AGATTTG ATCCTTA TCTACTT TAATACA CGTTTCC TCTATGG AAATCTA TACCTAG ACTAAAT

Left Arm
491 TTTGGTT TTTCATA ATCATAA TCTAACA ACATTTT CACTATA CTATACC TTCTTGC ACAAGTC GCCATTA
AAACCAA AAAGTAT TAGTATT AGATTGT TGTAAAA GTGATAT GATATGG AAGAACG TGTTCAG CGGTAAT

Left Arm
561 GTAGTAT AGACTTA TACTTTG TAACCAT AGTATAC TTTAGCG CGTCATC TTCTTCA TCTAAAA CAGATTT
CATCATA TCTGAAT ATGAAAC ATTGGTA TCATATG AAATCGC GCAGTAG AAGAAGT AGATTTT GTCTAAA

Left Arm
631 ACAACAA TAATCAT CGTCGTC ATCTTCA TCTTCAT TAAAGTT TTCATAT TCAATAA CTTTCTT TTCTAAA
TGTTGTT ATTAGTA GCAGCAG TAGAAGT AGAAGTA ATTTCAA AAGTATA AGTTATT GAAAGAA AAGATTT

Left Arm
701 ACATCAT CTGAATC AATAAAC ATAGAAC GGTATAG AGCGTTA ATCTCCA TTGTAAA ATATACT AACGCGT
TGTAGTA GACTTAG TTATTTG TATCTTG CCATATC TCGCAAT TAGAGGT AACATTT TATATGA TTGCGCA

Left Arm
771 TGCTCAT GATGTAC TTTTTTT CATTATT TAGAAAT TATGCAT TTTAGAT CTTTATA AGCGGCC GTGATTA
ACGAGTA CTACATG AAAAAAA GTAATAA ATCTTTA ATACGTA AAATCTA GAAATAT TCGCCGG CACTAAT

Left Arm
841 ACTAGTC ATAAAAA CCCGGGA [...] [...] GAGATAA AAACTAT ATCAGAG CAACCCC AACCAGC
TGATCAG TATTTTT GGGCCCT [...] [...] CTCTATT TTTGATA TAGTCTC GTTGGGG TTGGTCG



CEA 

***Ile LeuAla ValGly ValLeuVal - 
ACTCCAA TCATGAT GCCGACA GTGGCCC CAGCTGA GAGACCA GGAGAAG TTCCAGA TGCAGAG ACTGTGA 
TGAGGTT AGTACTA CGGCTGT CACCGGG GTCGACT CTCTGGT CCTCTTC AAGGTCT ACGTCTC TGACACT 

CEA 

..GlyIle MetIle GlyValThr AlaGly AlaSer LeuGlyPro SerThr GlySer AlaSerVal ThrIle-
TGCTCTT GACTATG GAATTAT TGCGGCC AGTAGCC AAGTTAG AGACAAA ACAGGCA TAGGTCC CGTTATT 
ACGAGAA CTGATAC CTTAATA AGGCCGG TCATCGG TTCAATC TCTGTTT TGTCCGT ATCCAGG GCAATAA 

CEA 

.SerLys ValIleSer AsnAsn ArgGly ThrAlaLeu AsnSer ValPhe CysAlaTyr ThrGly AsnAsn
ATTTGGC GTGATTT TGGCGAT AAAGAGA ACTTGTG TGTGTTG CTGCGGT ATCCCAT TGATACG CCAAGAA 
TAAACCG CACTAAA ACCGCTA TTTCTCT TGAACAC ACACAAC GACGCCA TAGGGTA ACTATGC GGTTCTT 

CEA 

AsnProThr IleLys AlaIle PheLeuVal GlnThr HisGln GlnProIle GlyAsn IleArg TrpSerTyr-
TACTGCG GGGATGG GTTAGAG GCCGAGT GGCAGGA GAGGTTG AGGTCCG CTCCCGA AAGGTAA GACGAGT 
ATGACGC CCCTACC CAATCTC CGGCTCA CCGTCCT CTCCAAC TCCAGGC GAGGGCT TTCCATT CTGCTCA 

CEA 

. .GlnPro SerPro AsnSerAla SerHis CysSer LeuAsnLeu AspAla GlySer LeuTyrSer SerAsp- 
CTGGGGG GGAAATG ATGGGGG TGTCCGG CCCATAG AGGACAT CCAGGGT GACTGGG TCACTGC GGTTTGC 
GACCCCC CCTTTAC TACCCCC ACAGGCC GGGTATC TCCTGTA GGTCCCA CTGACCC AGTGACG CCAAACG 

CEA 

.ProPro SerIleIle ProThr AspPro GlyTyrLeu ValAsp LeuThr ValProAsp SerArg AsnAla
ACTCACT GAGTTCT GGATTCC ACATACA TAGGCTC TTGCGTC ATTTCTT GTGACAT TGAATAG AGTGAGG 
TGAGTGA CTCAAGA CCTAAGG TGTATGT ATCCGAG AACGCAG TAAAGAA CACTGTA ACTTATC TCACTGC 

CEA 

SerValSer AsnGln IleGly CysValTyr AlaArg AlaAsp AsnArgThr ValAsn PheLeu ThrLeuThr- 
GTCCTGT TGCCATT GGACAGC TGCAGCC TGGGACT GACTGGG AGGCTCT GACCATT TACCCAC CACAGGT 
CAGGACA ACGGTAA CCTGTCG ACGTCGG ACCCTGA CTGACCC TCCGAGA CTGGTAA ATGGGTG GTGTCCA 

CEA 

. .ArgAsn GlyAsn SerLeuGln LeuArg ProSer ValProLeu SerGln GlyAsn ValTrpTrp LeuTyr- 
AGGTTGT GTTCTGA GCCTCAG GTTCACA GGTGAAG GCCACAG CATCCTT GTCCTCC ACGGGTT TGGAGTT 
TCCAACA CAAGACT CGGAGTC CAAGTGT CCACTTC CGGTGTC GTAGGAA CAGGAGG TGCCCAA ACCTCAA 

CEA 

.ThrThr AsnGlnAla GluPro GluCys ThrPheAla ValAla AspLys AspGluVal ProLys SerAsn 
1471 GTTGCTG GAGATGG AGGGCTT GGGCAGC TCCGCGG AAACAGT TATTGTT TTAACTG TAGTCCT GCTGTGA

CAACGAC CTCTACC TCCCGAA CCCGTCG AGGCGCC TTTGTCA ATAACAA AATTGAC ATCAGGA CGACACT 

CEA 

AsnSerSer IleSer ProLys ProLeuGlu AlaSer ValThr IleThrLys ValThr ThrArg SerHisGly- 
1541 CCACTGG CTGAGTT ATTGGCC TGGCAAG TATAGAG TCCGCTG TTCTTCT CAGTTAT GTTGCTT ATAAATA

GGTGACC GACTCAA TAACCGG ACCGTTC ATATCTC AGGCGAC AAGAAGA GTCAATA CAACGAA TATTTAT 

CEA 

..SerAla SerAsn AsnAlaGln CysThr TyrLeu GlySerAsn LysGlu ThrIle AsnSerIle PheLeu-
1611 ACTCTTG AGTATGC TGCTGAA TGTTTCC ATCAATC AGCCAGG AGTACTG TGCAGGG GGGTTGG ATGCTGC 

TGAGAAC TCATACG ACGACTT ACAAAGG TAGTTAG TCGGTCC TCATGAC ACGTCCC CCCAACC TACGACG

CEA 

.GluGln ThrHisGln GlnIle AsnGly AspIleLeu TrpSer TyrGln AlaProPro AsnSer AlaAla
1681 ATGGCAA GAAAGGC TCAAGTT CACGCCG GGACGGT AGTAGGT GTATGAT GGAGATA TAGTTGG GTCGTCT 

TACCGTT CTTTCCG AGTTCAA GTGCGGC CCTGCCA TCATCCA CATACTA CCTCTAT ATCAACC CAGCAGA 
CEA

HisCysSer LeuSer LeuAsn ValGlyPro ArgTyr TyrThr TyrSerPro SerIle ThrPro AspAspPro-
1751 GGGCCAT ACAAAAC ATTAAGG ATAACAG GGTCGGA GTGATCA ACGGATA ATTCATT CTGAATG CCACACT 

CCCGGTA TGTTTTG TAATTCC TATTGTC CCAGCCT CACTAGT TGCCTAT TAAGTAA GACTTAC GGTGTGA 

CEA 

..GlyTyr LeuVal AsnLeuIle ValPro AspSer HisAspVal SerLeu GluAsn GlnIleGly CysGlu-

1821 CATAAGG TCCTACA TCATTGC GAGTAAC GGACAGG AGTGTCA ATGTGCG GTTATCA TTAGACA ACTGCAA 

GTATTCC AGGATGT AGTAACG CTCATTG CCTGTCC TCACAGT TACACGC CAATAGT AATCTGT TGACGTT 

CEA 

.TyrPro GlyValAsp AsnArg ThrVal SerLeuLeu ThrLeu ThrArg AsnAspAsn SerLeu GlnLeu 
1891 GCGTGGG CTAACCG GCAAACT TTGGTTA TTGACCC ACCATAA ATAAGTG GTATTTT GAATCTC TGGCTCA

CGCACCC GATTGGC CGTTTGA AACCAAT AACTGGG TGGTATT TATTCAC CATAAAA CTTAGAG ACCGAGT 

CEA 

ArgProSer ValPro LeuSer GlnAsnAsn ValTrp TrpLeu TyrThrThr AsnGln IleGlu ProGluCys- 
1961 CAAGTTA ATGCAAC TGCGTCC TCATCCT CAACTGG GTTAGAA TTGTTAC TAGTTAT GAATGGT TTTGGTG 

GTTCAAT TACGTTG ACGCAGG AGTAGGA GTTGACC CAATCTT AACAATG ATCAATA CTTACCA AAACCAC

CEA 

..ThrLeu AlaVal AlaAspGlu AspGlu ValPro AsnSerAsn AsnSer ThrIle PheProLys ProPro-
2031 GCTCATA CACGGTA ATCGTCG TCACGGT TGTGCGG TTGAGTC CGGTGTC GCTATTG TGAGCTT GGCACGT 

CGAGTAT GTGCCAT TAGCAGC AGTGCCA ACACGCC AACTCAG GCCACAG CGATAAC ACTCGAA CCGTGCA 
CEA

.GluTyr ValThrIle ThrThr ValThr ThrArgAsn LeuGly ThrAsp SerAsnHis AlaGln CysThr
2101 GTAGGAT CCACTAT TGTTCAC GGTAATA TTGGGAA TGAACAG TTCCTGG GTGGACT GTTGGAA AGTGCCA 

CATCCTA GGTGATA ACAAGTG CCATTAT AACCCTT ACTTGTC AAGGACC CACCTGA CAACCTT TCACGGT 

CEA 

TyrSerGly SerAsn AsnVal ThrIleAsn ProIle PheLeu GluGlnThr SerGln GlnPhe ThrGlyAsn-

2171 TTGACAA ACCAGCT GTATTGG GCGGGAG GATTGCT AGCGGCA TGACAGC TCAGATT CAGATTT TCCCCTG 

AACTGTT TGGTCGA CATAACC CGCCCTC CTAACGA TCGCCGT ACTGTCG AGTCTAA GTCTAAA AGGGGAC 

CEA 

. .ValPhe TrpSer TyrGlnAla ProPro AsnSer AlaAlaHis CysSer LeuAsn LeuAsnGlu GlySer- 
2241 ATCTATA GCTTGTG TTTAGAG GGCTGAT TGTAGGA GCATCGG GTCCGTA AAGCACG TTGAGAA TCACTGA

TAGATAT CGAACAC AAATCTC CCGACTA ACATCCT CGTAGCC CAGGCAT TTCGTGC AACTCTT AGTGACT 

CEA 

.ArgTyr SerThrAsn LeuPro SerIle ThrProAla AspPro GlyTyr LeuValAsn LeuIle ValSer
2311 ATCAGAC CTCCTGG CGCTGAC TGGATTT TGGGTTT CGCATTT GTAGCTT GCTGTGT CGTTCCT GGTCACG 

TAGTCTG GAGGACC GCGACTG ACCTAAA ACCCAAA GCGTAAA CATCGAA CGACACA GCAAGGA CCAGTGC

CEA 

AspSerArg ArgAla SerVal ProAsnGln ThrGlu CysLys TyrSerAla ThrAsp AsnArg ThrValAsn- 
2381 TTAAACA GGGTCAG AGTTCTA TTTCCGT TGCTGAG TTGGAGT CTAGGGG ACACAGG CAGGGAC TGGTTGT 

AATTTGT CCCAGTC TCAAGAT AAAGGCA ACGACTC AACCTCA GATCCCC TGTGTCC GTCCCTG ACCAACA 
CEA

..PheLeu ThrLeu ThrArgAsn GlyAsn SerLeu GlnLeuArg ProSer ValPro LeuSerGln AsnAsn- 
2451 TCACCCA CCAGAGA TATGTTG CGTCTTG AGTTTCG GGCTCGC ATGTAAA AGCGACG GCATCTT TGTCTTC 

AGTGGGT GGTCTCT ATACAAC GCAGAAC TCAAAGC CCGAGCG TACATTT TCGCTGC CGTAGAA ACAGAAG 

CEA 

.ValTrp TrpLeuTyr ThrAla AspGln ThrGluPro GluCys ThrPhe AlaValAla AspLys AspGlu

2521 GACAGGC TTACTAT TATTGGA GCTAATA GAAGGCT TAGGGAG TTCCGGG TATACCC GGAACTG GCCAGTT 

CTGTCCG AATGATA ATAACCT CGATTAT CTTCCGA ATCCCTC AAGGCCC ATATGGG CCTTGAC CGGTCAA 

CEA 

ValProLys SerAsn AsnSer SerIleSer ProLys ProLeu GluProTyr ValArg PheGln GlyThrAla-
2591 GCTTCTT CATTCAC AAGATCT GACTTTA TGACGTG TAGGGTG TAGAATC CTGTGTC ATTCTGG ATGATGT

CGAAGAA GTAAGTG TTCTAGA CTGAAAT ACTGCAC ATCCCAC ATCTTAG GACACAG TAAGACC TACTACA 

CEA 

..GluGlu AsnVal LeuAspSer LysIle ValHis LeuThrTyr PheGly ThrAsp AsnGlnIle IleAsn-
2661 TCTGGAT CAGCAGG GATGCAT TGGGGTA TATTATC TCTCGAC CACTGTA TGCGGGC CCTGGGG TAGCTTG 

AGACCTA GTCGTCC CTACGTA ACCCCAT ATAATAG AGAGCTG GTGACAT ACGCCCG GGACCCC ATCGAAC

CEA 

.GlnIle LeuLeuSer AlaAsn ProTyr IleIleGlu ArgGly SerTyr AlaProGly ProThr AlaGln
2731 TTGAGTT CCTATTA CATATCC TATAATT TGACGGT TGCCATC CACTCTT TCACCTT TGTACCA GCTGTAG 

AACTCAA GGATAAT GTATAGG ATATTAA ACTGCCA ACGGTAG GTGAGAA AGTGGAA ACATGGT CGACATC 
CEA

GlnThrGly IleVal TyrGly IleIleGln ArgAsn GlyAsp ValArgGlu GlyLys TyrTrp SerTyrGly
CCAAAAA GATGCTG GGGCAGA TTGTGGA CAAGTAG AAGCACC TCCTTCC CCTCTGC GACATTG AACGGCG 
GGTTTTT CTACGAC CCCGTCT AACACCT GTTCATC TTCGTGG AGGAAGG GGAGACG CTGTAAC TTGCCGC 

CEA 

. .PheLeu HisGln ProLeuAsn HisVal LeuLeu LeuValGlu LysGly GluAla ValAsnPhe ProThr- 
TGGATTC AATAGTG AGCTTGG CAGTGGT GGGCGGG TTCCAGA AGGTTAG AAGTGAG GCTGTGA GCAGGAG 
ACCTAAG TTATCAC TCGAACC GTCACCA CCCGCCC AAGGTCT TCCAATC TTCACTC CGACACT CGTCCTC 

CEA 

.SerGlu IleThrLeu LysAla ThrThr ProProAsn TrpPhe ThrLeu LeuSerAla -ThrLeu LeuLeu 
CCTCTGC CAGGGGA TGCACCA TCTGTGG GGAGGGG CCGAGGG AGACTCC ATTATTT ATATTCC AAAAAAA 
GGAGACG GTCCCCT ACGTGGT AGACACC CCTCCCC GGCTCCC TCTGAGG TAATAAA TATAAGG TTTTTTT 



E/L Promoter 



CEA 

ArgGlnTrp ProIle CysTrp ArgHisPro ProAla SerPro SerGluMet

H6 promoter 



AAAAATA AAATTTC AATTTTT GTCGACC TGCAGCT CGACGGA TCCCCCC GGGTTCT TTATTCT ATACTTA 
TTTTTAT TTTAAAG TTAAAAA CAGCTGG ACGTCGA GCTGCCT AGGGGGG CCCAAGA AATAAGA TATGAAT 



E/L Promoter

H6 promoter 



AAAAGTG AAAATAA ATACAAA GGTTCTT GAGGGTT GTGTTAA ATTGAAA GCGAGAA ATAATCA TAAATTA 
TTTTCAC TTTTATT TATGTTT CCAAGAA CTCCCAA CACAATT TAACTTT CGCTCTT TATTAGT ATTTAAT 

p53 



H6 promoter 



MetGlu GluProGln SerAsp ProSer ValGluPro 
TTTCATT ATCGCGA TATCCGT TAAGTTT GTATCGT AATGGAG GAGCCGC AGTCAGA TCCTAGC GTCGAGC 
AAAGTAA TAGCGCT ATAGGCA ATTCAAA CATAGCA TTACCTC CTCGGCG TCAGTCT AGGATCG CAGCTCG 

p53 



..ProLeu SerGln GluThrPhe SerAsp LeuTrp LysLeuLeu ProGlu AsnAsn ValLeuSer ProLeu-
CCCCTCT GAGTCAG GAAACAT TTTCAGA CCTATGG AAACTAC TTCCTGA AAACAAC GTTCTGT CCCCCTT 
GGGGAGA CTCAGTC CTTTGTA AAAGTCT GGATACC TTTGATG AAGGACT TTTGTTG CAAGACA GGGGGAA 

p53 



.ProSer GlnAlaMet AspAsp LeuMet LeuSerPro AspAsp IleGlu GlnTrpPhe ThrGlu AspPro 
GCCGTCC CAAGCAA TGGATGA TTTGATG CTGTCCC CGGACGA TATTGAA CAATGGT TCACTGA AGACCCA 
CGGCAGG GTTCGTT ACCTACT AAACTAC GACAGGG GCCTGCT ATAACTT GTTACCA AGTGACT TCTGGGT 

p53 



GlyProAsp GluAla ProArg Met ProGlu AlaAla ProPro ValAlaPro AlaPro AlaAla ProThrPro 
GGTCCAG ATGAAGC TCCCAGA ATGCCAG AGGCTGC TCCCCCC GTGGCCC CTGCACC AGCAGCT CCTACAC 
CCAGGTC TACTTCG AGGGTCT TACGGTC TCCGACG AGGGGGG CACCGGG GACGTGG TCGTCGA GGATGTG 

p53 



. .AlaAla ProAla ProAlaPro SerTrp ProLeu SerSerSer ValPro SerGln LysThrTyr GlnGly 
CGGCGGC CCCTGCA CCAGCCC CCTCCTG GCCCCTG TCATCTT CTGTCCC TTCCCAG AAAACCT ACCAGGG 
GCCGCCG GGGACGT GGTCGGG GGAGGAC CGGGGAC AGTAGAA GACAGGG AAGGGTC TTTTGGA TGGTCCC 

p53



.SerTyr GlyPheArg LeuGly PheLeu HisSerGly ThrAla LysSer ValThrCys ThrTyr SerPro 
CAGCTAC GGTTTCC GTCTGGG CTTCTTG CATTCTG GGACAGC CAAGTCT GTGACTT GCACGTA CTCCCCT 
GTCGATG CCAAAGG CAGACCC GAAGAAC GTAAGAC CCTGTCG GTTCAGA CACTGAA CGTGCAT GAGGGGA 

p53 



AlaLeuAsn LysMet PheCys GlnLeuAla LysThr CysPro ValGlnLeu TrpVal AspSer ThrProPro 
GCCCTCA ACAAGAT GTTTTGC CAACTGG CCAAGAC CTGCCCT GTGCAGC TGTGGGT TGATTCC ACACCCC 
CGGGAGT TGTTCTA CAAAACG GTTGACC GGTTCTG GACGGGA CACGTCG ACACCCA ACTAAGG TGTGGGG 

p53 



..ProGly ThrArg ValArgAla MetAla IleTyr LysGlnSer GlnHis MetThr GluValVal ArgArg- 
CGCCCGG CACCCGC GTCCGCG CCATGGC CATCTAC AAGCAGT CACAGCA CATGACG GAGGTTG TGAGGCG 
GCGGGCC GTGGGCG CAGGCGC GGTACCG GTAGATG TTCGTCA GTGTCGT GTACTGC CTCCAAC ACTCCGC 

p53



.CysPro HisHisGlu ArgCys SerAsp SerAspGly LeuAla ProPro GlnHis Leu IleArg ValGlu 
CTGCCCC CACCATG AGCGCTG CTCAGAT AGCGATG GTCTGGC CCCTCCT CAGCATC TTATCCG AGTGGAA 
GACGGGG GTGGTAC TCGCGAC GAGTCTA TCGCTAC CAGACCG GGGAGGA GTCGTAG AATAGGC TCACCTT 
p53

GlyAsnLeu ArgVal GluTyr LeuAspAsp ArgAsn ThrPhe ArgHisSer ValVal ValPro TyrGluPro-
3781 GGAAATT TGCGTGT GGAGTAT TTGGATG ACAGAAA CACTTTT CGACATA GTGTGGT GGTGCCC TATGAGC
CCTTTAA ACGCACA CCTCATA AACCTAC TGTCTTT GTGAAAA GCTGTAT CACACCA CCACGGG ATACTCG

p53

..ProGlu ValGly SerAspCys ThrThr IleHis TyrAsnTyr MetCys AsnSer SerCysMet GlyGly-
3851 CGCCTGA GGTTGGC TCTGACT GTACCAC CATCCAC TACAACT ACATGTG TAACAGT TCCTGCA TGGGCGG
GCGGACT CCAACCG AGACTGA CATGGTG GTAGGTG ATGTTGA TGTACAC ATTGTCA AGGACGT ACCCGCC

p53

.MetAsn ArgArgPro IleLeu ThrIle IleThrLeu GluAsp SerSer GlyAsnLeu LeuGly ArgAsn
3921 CATGAAC CGGAGGC CCATCCT CACCATC ATCACAC TGGAAGA CTCCAGT GGTAATC TACTGGG ACGGAAC
GTACTTG GCCTCCG GGTAGGA GTGGTAG TAGTGTG ACCTTCT GAGGTCA CCATTAG ATGACCC TGCCTTG

p53

SerPheGlu ValArg ValCys AlaCysPro GlyArg AspArg ArgThrGlu GluGlu AsnLeu ArgLysLys-
3991 AGCTTTG AGGTGCG TGTTTGT GCCTGTC CTGGGAG AGACCGG CGCACAG AGGAAGA GAATCTC CGCAAGA
TCGAAAC TCCACGC ACAAACA CGGACAG GACCCTC TCTGGCC GCGTGTC TCCTTCT CTTAGAG GCGTTCT

p53

..GlyGlu ProHis HisGluLeu ProPro GlySer ThrLysArg AlaLeu ProAsn AsnThrSer SerSer-
4061 AAGGGGA GCCTCAC CACGAGC TGCCCCC AGGGAGC ACTAAGC GAGCACT GCCCAAC AACACCA GCTCCTC
TTCCCCT CGGAGTG GTGCTCG ACGGGGG TCCCTCG TGATTCG CTCGTGA CGGGTTG TTGTGGT CGAGGAG

p53

.ProGln ProLysLys LysPro LeuAsp GlyGluTyr PheThr LeuGln IleArgGly ArgGlu ArgPhe
4131 TCCCCAG CCAAAGA AGAAACC ACTGGAT GGAGAAT ATTTCAC CCTTCAG ATCCGTG GGCGTGA GCGCTTC
AGGGGTC GGTTTCT TCTTTGG TGACCTA CCTCTTA TAAAGTG GGAAGTC TAGGCAC CCGCACT CGCGAAG

p53

GluMetPhe ArgGlu LeuAsn GluAlaLeu GluLeu LysAsp AlaGlnAla GlyLys GluPro GlyGlySer-
4201 GAGATGT TCCGAGA GCTGAAT GAGGCCT TGGAACT CAAGGAT GCCCAGG CTGGGAA GGAGCCA GGGGGGA
CTCTACA AGGCTCT CGACTTA CTCCGGA ACCTTGA GTTCCTA CGGGTCC GACCCTT CCTCGGT CCCCCCT

p53

..ArgAla HisSer SerHisLeu LysSer LysLys GlyGlnSer ThrSer ArgHis LysLysLeu MetPhe
4271 GCAGGGC TCACTCC AGCCACC TGAAGTC CAAAAAG GGTCAGT CTACCTC CCGCCAT AAAAAAC TCATGTT
CGTCCCG AGTGAGG TCGGTGG ACTTCAG GTTTTTC CCAGTCA GATGGAG GGCGGTA TTTTTTG AGTACAA

p53

.LysThr GluGlyPro AspSer Asp***
4341 CAAGACA GAAGGGC CTGACTC AGACTGA ACGCGTT TTTTATC CCGGGCT CGAGGGT ACCGGAT CCTTTTT
GTTCTGT CTTCCCG GACTGAG TCTGACT TGCGCAA AAAATAG GGCCCGA GCTCCCA TGGCCTA GGAAAAA

4411 ATAGCTA ATTAGTC ACGTACC TTTGAGA GTACCAC TTCAGCT ACCTCTT TTGTGTC TCAGAGT AACTTTC
TATCGAT TAATCAG TGCATGG AAACTCT CATGGTG AAGTCGA TGGAGAA AACACAG AGTCTCA TTGAAAG




















Right Arm
4481 TTTAATC AATTCCA AAACAGT ATATGAT TTTCCAT TTCTTTC AAAGATG TAGTTTA CATCTGC TCCTTTG
AAATTAG TTAAGGT TTTGTCA TATACTA AAAGGTA AAGAAAG TTTCTAC ATCAAAT GTAGACG AGGAAAC

Right Arm
4551 TTGAAAA GTAGCCT GAGCACT TCTTTTC TACCATG AATTACA GCTGGCA AGATCAA TTTTTCC CAGTTCT
AACTTTT CATCGGA CTCGTGA AGAAAAG ATGGTAC TTAATGT CGACCGT TCTAGTT AAAAAGG GTCAAGA

Right Arm
4621 GGACATT TTATTTT TTTTAAG TAGTGTG CTACATA TTTCAAT ATTTCCA GATTGTA CAGCGAT CATTAAA
CCTGTAA AATAAAA AAAATTC ATCACAC GATGTAT AAAGTTA TAAAGGT CTAACAT GTCGCTA GTAATTT

Right Arm
4691 GGAGTAC GTCCCAT GTTATCC AGCAAGT CAGTATC AGCACCT TTGTTCA ATAGAAG TTTAACC ATTGTTA
CCTCATG CAGGGTA CAATAGG TCGTTCA GTCATAG TCGTGGA AACAAGT TATCTTC AAATTGG TAACAAT

Right Arm
4761 AATTTTT ATTTGAT ACGGCTA TATGTAG AGGAGTT AACCGAT CCGTGTT TGAAATA TCTACAT CCGCCGA
TTAAAAA TAAACTA TGCCGAT ATACATC TCCTCAA TTGGCTA GGCACAA ACTTTAT AGATGTA GGCGGCT

Right Arm
4831 ATGAGCC AATAGAA GTTTAAC CAAATTA ACTTTGT TAAGGTA AGCTGCC AAACACA AAGGAGT AAAGCCT
TACTCGG TTATCTT CAAATTG GTTTAAT TGAAACA ATTCCAT TCGACGG TTTGTGT TTCCTCA TTTCGGA

Right Arm
4901 CCGCTGT AAAGAAC ATTGTTT ACATAGT TATTCTT CAACAGA TCTTTCA CTATTTT GTAGTCG TCTCTCA
GGCGACA TTTCTTG TAACAAA TGTATCA ATAAGAA GTTGTCT AGAAAGT GATAAAA CATCAGC AGAGAGT

Right Arm
4971 ACACCGC ATCATGC AGACAAG AAGTTGT GCATTCA GTAACTA CAGGTTT AGCTCCA TACCTCA TCAAGAT
TGTGGCG TAGTACG TCTGTTC TTCAACA CGTAAGT CATTGAT GTCCAAA TCGAGGT ATGGAGT AGTTCTA












Right Arm
5041 TTTTATA GCCTCGG TATTCTT GAACATT ACAGCCA TTTCAAG AGGAGAT TGTAGAG TACCATA TTCCGTG
AAAATAT CGGAGCC ATAAGAA CTTGTAA TGTCGGT AAAGTTC TCCTCTA ACATCTC ATGGTAT AAGGCAC

Right Arm
5111 TTAGGGT CGAATCC ATTGTCC AAAAACC TATTTAG AGATGCA TTGTCAT TATCCAT GATAGCC TCACAGA
AATCCCA GCTTAGG TAACAGG TTTTTGG ATAAATC TCTACGT AACAGTA ATAGGTA CTATCGG AGTGTCT

Right Arm
5181 CGTATAT GTAAGCC ATCTTGA ATGTATA ATTTTGT TGTTTTC AACAACC GCTCGTG AACAGCT TCTATAC
GCATATA CATTCGG TAGAACT TACATAT TAAAACA ACAAAAG TTGTTGG CGAGCAC TTGTCGA AGATATG

Right Arm
5251 TTTTTCA TTTTCTT CATGATT AATATAG TTTACGG AATATAA GTATACA AAAAGTT TATAGTA ATCTCAT
AAAAAGT AAAAGAA GTACTAA TTATATC AAATGCC TTATATT CATATGT TTTTCAA ATATCAT TAGAGTA

Right Arm
5321 AATATCT GAAACAC ATACATA AAACATG GAAGAAT TACACGA TGTCGTT GAGATAA ATGGCTT TTTATTG
TTATAGA CTTTGTG TATGTAT TTTGTAC CTTCTTA ATGTGCT ACAGCAA CTCTATT TACCGAA AAATAAC

Right Arm
5391 TCATAGT TTACAAA TTCGCAG TAATCTT CATCTTT TACGAAT ATTGCAG AATCTGT TTTATCC AACCAGT
AGTATCA AATGTTT AAGCGTC ATTAGAA GTAGAAA ATGCTTA TAACGTC TTAGACA AAATAGG TTGGTCA

Right Arm
5461 GATTTTT GTATAAT ATAACTG GTATCCT ATCTTCC GATAGAA TGCTGTT ATTTAAC ATTTTTG CACCTAT
CTAAAAA CATATTA TATTGAC CATAGGA TAGAAGG CTATCTT ACGACAA TAAATTG TAAAAAC GTGGATA

Right Arm
5531 TAAGTTA CATCTGT CAAATCC ATCTTTC CAACTGA CTTTATG TAACGAT GCGAAAT AGCATTT ATCACTA
ATTCAAT GTAGACA GTTTAGG TAGAAAG GTTGACT GAAATAC ATTGCTA CGCTTTA TCGTAAA TAGTGAT

Right Arm
5601 TGTCGTA CCCAATT ATCATGA CAAGATT CTCTTAA ATAGGTA ATCTTAT TATCTCT TGCATAT TCGTAAT
ACAGCAT GGGTTAA TAGTACT GTTCTAA GAGAATT TATGCAT TAGAATA ATAGAGA ACGTATA AGCATTA

Right Arm
5671 AGTAATT GTAAAGA GTATACG ATAACAG TATAGAT ATACACG TGATATA AATATTT AACCCCA TTCCTGA
TCATTAA CATTTCT CATATGC TATTGTC ATATCTA TATGTGC ACTATAT TTATAAA TTGGGGT AAGGACT

Right Arm
5741 GTAAAAT AATTACG ATATTAC ATTTCCT TTTATTA TTTTTAT GTTTTAG TTATTTG TTAGGTT ATACAAA
CATTTTA TTAATGC TATAATG TAAAGGA AAATAAT AAAAATA CAAAATC AATAAAC AATCCAA TATGTTT

Right Arm
5811 AATTATG TTTATTT GTGTATA TTTAAAG GGTCGTT AAGAATA AGCTTAG TTAACAT ATTATCG CTTAGGT
TTAATAC AAATAAA CACATAT AAATTTC GCAGCAA TTCTTAT TCGAATC AATTGTA TAATAGC GAATCCA

Right Arm
5881 TTTGTAG TATTTGA ATCCTTT CTTTAAA TGGATTA TTTTTCC AATGCAT ATTTATA GCTTCAT CCAAAGT
AAACATC ATAAACT TAGGAAA GAAATTT ACCTAAT AAAAAGG TTACGTA TAAATAT CGAAGTA GGTTTCA

Right Arm
5951 ATAACAT TTAACAT TCAGAAT TGCGGCC GCAATTC AATTCGT AATCATG GTCATAG CTGTTTC CTGTGTG
TATTGTA AATTGTA AGTCTTA ACGCCGG CGTTAAG TTAAGCA TTAGTAC CAGTATC GACAAAG GACACAC

6021 AAATTGT TATCCGC TCACAAT TCCACAC AACATAC GAGCCGG AAGCATA AAGTGTA AAGCCTG GGGTGCC
TTTAACA ATAGGCG AGTGTTA AGGTGTG TTGTATG CTCGGCC TTCGTAT TTCACAT TTCGGAC CCCACGG

6091 TAATGAG TGAGCTA ACTCACA TTAATTG CGTTGCG CTCACTG CCCGCTT TCCAGTC GGGAAAC CTGTCGT
ATTACTC ACTCGAT TGAGTGT AATTAAC GCAACGC GAGTGAC GGGCGAA AGGTCAG CCCTTTG GACAGCA

6161 GCCAGCT GCATTAA TGAATCG GCCAACG CGCGGGG AGAGGCG GTTTGCG TATTGGG CGCTCTT CCGCTTC
CGGTCGA CGTAATT ACTTAGC CGGTTGC GCGCCCC TCTCCGC CAAACGC ATAACCC GCGAGAA GGCGAAG

6231 CTCGCTC ACTGACT CGCTGCG CTCGGTC GTTCGGC TGCGGCG AGCGGTA TCAGCTC ACTCAAA GGCGGTA
GAGCGAG TGACTGA GCGACGC GAGCCAG CAAGCCG ACGCCGC TCGCCAT AGTCGAG TGAGTTT CCGCCAT

6301 ATACGGT TATCCAC AGAATCA GGGGATA ACGCAGG AAAGAAC ATGTGAG CAAAAGG CCAGCAA AAGGCCA
TATGCCA ATAGGTG TCTTAGT CCCCTAT TGCGTCC TTTCTTG TACACTC GTTTTCC GGTCGTT TTCCGGT

6371 GGAACCG TAAAAAG GCCGCGT TGCTGGC GTTTTTC CATAGGC TCCGCCC CCCTGAC GAGCATC ACAAAAA
CCTTGGC ATTTTTC CGGCGCA ACGACCG CAAAAAG GTATCCG AGGCGGG GGGACTG CTCGTAG TGTTTTT

6441 TCGACGC TCAAGTC AGAGGTG GCGAAAC CCGACAG GACTATA AAGATAC CAGGCGT TTCCCCC TGGAAGC
AGCTGCG AGTTCAG TCTCCAC CGCTTTG GGCTGTC CTGATAT TTCTATG GTCCGCA AAGGGGG ACCTTCG

6511 TCCCTCG TGCGCTC TCCTGTT CCGACCC TGCCGCT TACCGGA TACCTGT CCGCCTT TCTCCCT TCGGGAA
AGGGAGC ACGCGAG AGGACAA GGCTGGG ACGGCGA ATGGCCT ATGGACA GGCGGAA AGAGGGA AGCCCTT

6581 GCGTGGC GCTTTCT CATAGCT CACGCTG TAGGTAT CTCAGTT CGGTGTA GGTCGTT CGCTCCA AGCTGGG
CGCACCG CGAAAGA GTATCGA GTGCGAC ATCCATA GAGTCAA GCCACAT CCAGCAA GCGAGGT TCGACCC

6651 CTGTGTG CACGAAC CCCCCGT TCAGCCC GACCGCT GCGCCTT ATCCGGT AACTATC GTCTTGA GTCCAAC
GACACAC GTGCTTG GGGGGCA AGTCGGG CTGGCGA CGCGGAA TAGGCCA TTGATAG CAGAACT CAGGTTG

6721 CCGGTAA GACACGA CTTATCG CCACTGG CAGCAGC CACTGGT AACAGGA TTAGCAG AGCGAGG TATGTAG
GGCCATT CTGTGCT GAATAGG GGTGACC GTCGTCG GTGACCA TTGTCCT AATCGTC TCGCTCC ATACATC

6791 GCGGTGC TACAGAG TTCTTGA AGTGGTG GCCTAAC TACGGCT ACACTAG AAGGACA GTATTTG GTATCTG
CGCCACG ATGTCTC AAGAACT TCACCAC CGGATTG ATGCCGA TGTGATC TTCCTGT CATAAAC CATAGAC

6861 CGCTCTG CTGAAGC CAGTTAC CTTCGGA AAAAGAG TTGGTAG CTCTTGA TCCGGCA AACAAAC CACCGCT
GCGAGAC GACTTCG GTCAATG GAAGCCT TTTTCTC AACCATC GAGAACT AGGCCGT TTGTTTG GTGGCGA

6931 GGTAGCG GTGGTTT TTTTGTT TGCAAGC AGCAGAT TACGCGC AGAAAAA AAGGATC TCAAGAA GATCCTT
CCATCGC CACCAAA AAAACAA ACGTTCG TCGTCTA ATGCGCG TCTTTTT TTCCTAG AGTTCTT CTAGGAA

7001 TGATCTT TTCTACG GGGTCTG ACGCTCA GTGGAAC GAAAACT CACGTTA AGGGATT TTGGTCA TGAGATT
ACTAGAA AAGATGC CCCAGAC TGCGAGT CACCTTG CTTTTGA GTGCAAT TCCCTAA AACCAGT ACTCTAA

7071 ATCAAAA AGGATCT TCACCTA GATCCTT TTAAATT AAAAATG AAGTTTT AAATCAA TCTAAAG TATATAT
TAGTTTT TCCTAGA AGTGGAT CTAGGAA AATTTAA TTTTTAC TTCAAAA TTTAGTT AGATTTC ATATATA
7141 GAGTAAA CTTGGTC TGACAGT TACCAAT GCTTAAT CAGTGAG GCACCTA TCTCAGC GATCTGT CTATTTC
CTCATTT GAACCAG ACTGTCA ATGGTTA CGAATTA GTCACTC CGTGGAT AGAGTCG CTAGACA GATAAAG

Amp resistance gene
7211 GTTCATC CATAGTT GCCTGAC TCCCCGT CGTGTAG ATAACTA CGATACG GGAGGGC TTACCAT CTGGCCC
CAAGTAG GTATCAA CGGACTG AGGGGCA GCACATC TATTGAT GCTATGC CCTCCCG AATGGTA GACCGGG

Amp resistance gene
7281 CAGTGCT GCAATGA TACCGCG AGACCCA CGCTCAC CGGCTCC AGATTTA TCAGCAA TAAACCA GCCAGCC
GTCACGA CGTTACT ATGGCGC TCTGGGT GCGAGTG GCCGAGG TCTAAAT AGTCGTT ATTTGGT CGGTCGG

Amp resistance gene
7351 GGAAGGG CCGAGCG CAGAAGT GGTCCTG CAACTTT ATCCGCC TCCATCC AGTCTAT TAATTGT TGCCGGG
CCTTCCC GGCTCGC GTCTTCA CCAGGAC GTTGAAA TAGGCGG AGGTAGG TCAGATA ATTAACA ACGGCCC

Amp resistance gene
7421 AAGCTAG AGTAAGT AGTTCGC CAGTTAA TAGTTTG CGCAACG TTGTTGC CATTGCT ACAGGCA TCGTGGT
TTCGATC TCATTCA TCAAGCG GTCAATT ATCAAAC GCGTTGC AACAACG GTAACGA TGTCCGT AGCACCA

Amp resistance gene
7491 GTCACGC TCGTCGT TTGGTAT GGCTTCA TTCAGCT CCGGTTC CCAACGA TCAAGGC GAGTTAC ATGATCC
CAGTGCG AGCAGCA AACCATA CCGAAGT AAGTCGA GGCCAAG GGTTGCT AGTTCCG CTCAATG TACTAGG

Amp resistance gene
7561 CCCATGT TGTGCAA AAAAGCG GTTAGCT CCTTCGG TCCTCCG ATCGTTG TCAGAAG TAAGTTG GCCGCAG
GGGTACA ACACGTT TTTTCGC CAATCGA GGAAGCC AGGAGGC TAGCAAC AGTCTTC ATTCAAC CGGCGTC

Amp resistance gene
7631 TGTTATC ACTCATG GTTATGG CAGCACT GCATAAT TCTCTTA CTGTCAT GCCATCC GTAAGAT GCTTTTC
ACAATAG TGAGTAC CAATACC GTCGTGA CGTATTA AGAGAAT GACAGTA CGGTAGG CATTCTA CGAAAAG

Amp resistance gene
7701 TGTGACT GGTGAGT ACTCAAC CAAGTCA TTCTGAG AATAGTG TATGCGG CGACCGA GTTGCTC TTGCCCG
ACACTGA CCACTCA TGAGTTG GTTCAGT AAGACTC TTATCAC ATACGCC GCTGGCT CAACGAG AACGGGC

Amp resistance gene
7771 GCGTCAA TACGGGA TAATACC GCGCCAC ATAGCAG AACTTTA AAAGTGC TCATCAT TGGAAAA CGTTCTT
CGCAGTT ATGCCCT ATTATGG CGCGGTG TATCGTC TTGAAAT TTTCACG AGTAGTA ACCTTTT GCAAGAA

Amp resistance gene
7841 CGGGGCG AAAACTC TCAAGGA TCTTACC GCTGTTG AGATCCA GTTGGAT GTAACCC ACTCGTG CACCCAA
GCCCCGC TTTTGAG AGTTCCT AGAATGG CGACAAC TCTAGGT CAAGCTA CATTGGG TGAGCAC GTGGGTT

Amp resistance gene
7911 CTGATCT TCAGCAT CTTTTAC TTTCACC AGCGTTT CTGGGTG AGCAAAA ACAGGAA GGCAAAA TGCCGCA
GACTAGA AGTCGTA GAAAATG AAAGTGG TCGCAAA GACCCAC TCGTTTT TGTCCTT CCGTTTT ACGGCGT

Amp resistance gene
7981 AAAAAGG GAATAAG GGCGACA CGGAAAT GTTGAAT ACTCATA CTCTTCC TTTTTCA ATATTAT TGAAGCA
TTTTTCC CTTATTC CCGCTGT GCCTTTA CAACTTA TGAGTAT GAGAAGG AAAAAGT TATAATA ACTTCGT

Amp resistance gene
8051 TTTATCA GGGTTAT TGTCTCA TGAGCGG ATACATA TTTGAAT GTATTTA GAAAAAT AAACAAA TAGGGGT
AAATAGT CCCAATA ACAGAGT ACTCGCC TATGTAT AAACTTA CATAAAT CTTTTTA TTTGTTT ATCCCCA

8121 TCCGCGC ACATTTC CCCGAAA AGTGCCA CCTGACG TCTAAGA AACCATT ATTATCA TGACATT AACCTAT
AGGCGCG TGTAAAG GGGCTTT TCACGGT GGACTGC AGATTCT TTGGTAA TAATAGT ACTGTAA TTGGATA

8191 AAAAATA GGCGTAT CACGAG
TTTTTAT CCGCATA GTGCTC
FIGURE 2A



mCEA(6D)
mCEA(6D,1st&2nd)



1 50 
ATGGAGTCTC CCTCGGCCCC TCCCCACAGA TGGTGCATCC CCTGGCAGAG 
ATGGAGTCTC CCTCGGCCCC TCCCCACAGA TGGTGCATCC CCTGGCAGAG 

51 100 
GCTCCTGCTC ACAGCCTCAC TTCTAACCTT CTGGAACCCG CCCACCACTG 
GCTCCTGCTC ACAGCCTCAC TTCTAACCTT CTGGAACCCG CCCACCACTG 

101 150 
CCAAGCTCAC TATTGAATCC ACGCCGTTCA ATGTCGCAGA GGGGAAGGAG 
CCAAGCTCAC TATTGAATCC ACGCCGTTCA ATGTCGCAGA GGGGAAGGAG 

151 200 
GTGCTTCTAC TTGTCCACAA TCTGCCCCAG CATCTTTTTG GCTACAGCTG 
GTGCTTCTAC TTGTCCACAA TCTGCCCCAG CATCTTTTTG GCTACAGCTG 

201 250 
GTACAAAGGT GAAAGAGTGG ATGGCAACCG TCAAATTATA GGATATGTAA 
GTACAAAGGT GAAAGAGTGG ATGGCAACCG TCAAATTATA GGATATGTAA 



251 300
TAGGAACTCA ACAAGCTACC CCAGGGCCCG CATACAGTGG TCGAGAGATA
TAGGAACTCA ACAAGCTACC CCAGGGCCCG CATACAGTGG TCGAGAGATA



301 350 
ATATACCCCA ATGCATCCCT GCTGATCCAG AACATCATCC AGAATGACAC 
ATATACCCCA ATGCATCCCT GCTGATCCAG AACATCATCC AGAATGACAC 

351 400 
AGGATTCTAC ACCCTACACG TCATAAAGTC AGATCTTGTG AATGAAGAAG 
AGGATTCTAC ACCCTACACG TCATAAAGTC AGATCTTGTG AATGAAGAAG 

401 450 
CAACTGGCCA GTTCCGGGTA TACCCGGAGC TGCCCAAGCC CTCCATCTCC 
CAACTGGCCA GTTCCGGGTA TACCCGGAAC TCCCTAAGCC TTCTATTAGC 

451 500 
AGCAACAACT CCAAACCCGT GGAGGACAAG GATGCTGTGG CCTTCACCTG 
TCCAATAATA GTAAGCCTGT CGAAGACAAA GATGCCGTCG CTTTTACATG 

501 550 
TGAACCTGAG ACTCAGGACG CAACCTACCT GTGGTGGGTA AACAATCAGA 
CGAGCCCGAA ACTCAAGACG CAACATATCT CTGGTGGGTG AACAACCAGT 

551 600 
GCCTCCCGGT CAGTCCCAGG CTGCAGCTGT CCAATGGCAA CAGGACCCTC
CCCTGCCTGT GTCCCCTAGA CTCCAACTCA GCAACGGAAA TAGAACTCTG

601 650 
ACTCTATTCA ATGTCACAAG AAATGACACA GCAAGCTACA AATGTGAAAC 
ACCCTGTTTA ACGTGACCAG GAACGACACA GCAAGCTACA AATGCGAAAC 
FIGURE 2B

651 700 
CCAGAACCCA GTGAGTGCCA GGCGCAGTGA TTCAGTCATC CTGAATGTCC 
CCAAAATCCA GTCAGCGCCA GGAGGTCTGA TTCAGTGATT CTCAACGTGC 

701 750 
TCTATGGCCC GGATGCCCCC ACCATTTCCC CTCTAAACAC ATCTTACAGA 
TTTACGGACG CGATGCTCCT ACAATCAGCC CTCTAAACAC AAGCTATAGA

751 800 
TCAGGGGAAA ATCTGAACCT CTCCTGCCAC GCAGCCTCTA ACCCACCTGC 
TCAGGGGAAA ATCTGAATCT GAGCTGTCAT GCCGCTAGCA ATCCTCCCGC

801 850 
ACAGTACTCT TGGTTTGTCA ATGGGACTTT CCAGCAATCC ACCCAAGAGC 
CCAATACAGC TGGTTTGTCA ATGGGACTTT CCAACAGTCC ACCCAGGAAC

851 900
TCTTTATCCC CAACATCACT GTGAATAATA GTGGATCCTA TACGTGCCAA 
TGTTCATTCC CAATATTACC GTGAACAATA GTGGATCCTA CACGTGCCAA 

901 950
GCCCATAACT CAGACACTGG CCTCAATAGG ACCACAGTCA CGACGATCAC 
GCTCACAATA GCGACACCGG ACTCAACCGC ACAACCGTGA CGACGATTAC 

951 1000 
AGTCTATGAG CCACCCAAAC CCTTCATCAC CAGCAACAAC TCCAACCCCG 
CGTGTATGAG CCACCAAAAC CATTCATAAC TAGTAACAAT TCTAACCCAG

1001 1050 
TGGAGGATGA GGATGCTGTA GCCTTAACCT GTGAACCTGA GATTCAGAAC 
TTGAGGATGA GGACGCAGTT GCATTAACTT GTGAGCCAGA GATTCAAAAT 

1051 1100 
ACAACCTACC TGTGGTGGGT AAATAATCAG AGCCTCCCGG TCAGTCCCAG 
ACCACTTATT TATGGTGGGT CAATAACCAA AGTTTGCCGG TTAGCCCACG 

1101 1150 
GCTGCAGCTG TCCAATGACA ACAGGACCCT CACTCTACTC AGTGTCACAA 
CTTGCAGTTG TCTAATGATA ACCGCACATT GACACTCCTG TCCGTTACTC

1151 1200
GGAATGATGT AGGACCCTAT GAGTGTGGAA TCCAGAACGA ATTAAGTGTT
GCAATGATGT AGGACCTTAT GAGTGTGGCA TTCAGAATGA ATTATCCGTT

1201 1250
GACCACAGCG ACCCAGTCAT CCTGAATGTC CTCTATGGCC CAGACGACCC
GATCACTCCG ACCCTGTTAT CCTTAATGTT TTGTATGGCC CAGACGACCC

1251 1300 
CACCATTTCC CCCTCATACA CCTATTACCG TCCAGGGGTG AACCTCAGCC 
AACTATATCT CCATCATACA CCTACTACCG TCCCGGCGTG AACTTGAGCC 
FIGURE 2C

mCEA(6D)
mCEA(6D,1st&2nd)



1301 1350 
TCTCCTGCCA TGGAGCCTCT AACCCACCTG CACAGTATTC TTGGCTGATT 
TTTCTTGCCA TGCAGCATCC AACCCCCCTG CACAGTACTC CTGGCTGATT 

1351 1400 
GATGGGAACA TCCAGCAAGA CACACAAGAG CTCTTTATCT CCAACATCAC 
GATGGAAACA TTCAGCAGGA TACTCAAGAG TTATTTATAA GCAACATAAC 

1401 1450 
TGAGAAGAAC AGCGGACTCT ATACCTGCGA GGCCAATAAC TCAGCCAGTG 
TGAGAAGAAC AGCGGACTCT ATACTTGCCA GGCCAATAAC TCAGCCAGTG 

1451 1500 
GCCACAGCAG GACTACAGTC AAGACAATCA CAGTCTCTGC GGAGCTGCCC 
GTCACAGCAG GACTACAGTT AAAACAATAA CTGTTTCCGC GGAGCTGCCC 

1501 1550 
AAGCCCTCCA TCTCCAGCAA CAACTCCAAA CCCGTGGAGG ACAAGGATGC 
AAGCCCTCCA TCTCCAGCAA CAACTCCAAA CCCGTGGAGG ACAAGGATGC 

1551 1600 
TGTGGCCTTC ACCTGTGAAC CTGAGGCTCA GAACACAACC TACCTGTGGT 
TGTGGCCTTC ACCTGTGAAC CTGAGGCTCA GAACACAACC TACCTGTGGT 

1601 1650 
GGGTAAATGG TCAGAGCCTC CCAGTCAGTC CCAGGCTGCA GCTGTCCAAT 
GGGTAAATGG TCAGAGCCTC CCAGTCAGTC CCAGGCTGCA GCTGTCCAAT 

1651 1700 
GGCAACAGGA CCCTCACTCT ATTCAATGTC ACAAGAAATG ACGCAAGAGC 
GGCAACAGGA CCCTCACTCT ATTCAATGTC ACAAGAAATG ACGCAAGAGC 

1701 1750 
CTATGTATGT GGAATCCAGA ACTCAGTGAG TGCAAACCGC AGTGACCCAG 
CTATGTATGT GGAATCCAGA ACTCAGTGAG TGCAAACCGC AGTGACCCAG 

1751 1800 
TCACCCTGGA TGTCCTCTAT GGGCCGGACA CCCCCATCAT TTCCCCCCCA 
TCACCCTGGA TGTCCTCTAT GGGCCGGACA CCCCCATCAT TTCCCCCCCA 

1801 1850 
GACTCGTCTT ACCTTTCGGG AGCGGACCTC AACCTCTCCT GCCACTCGGC 
GACTCGTCTT ACCTTTCGGG AGCGGACCTC AACCTCTCCT GCCACTCGGC 

1851 1900 
CTCTAACCCA TCCCCGCAGT ATTCTTGGCG TATCAATGGG ATACCGCAGC 
CTCTAACCCA TCCCCGCAGT ATTCTTGGCG TATCAATGGG ATACCGCAGC 

1901 1950 
AACACACACA AGTTCTCTTT ATCGCCAAAA TCACGCCAAA TAATAACGGG 
AACACACACA AGTTCTCTTT ATCGCCAAAA TCACGCCAAA TAATAACGGG 



9/22 



WO 2005/068640 



PCT/US2004/042980 



FIGURE 2D 



mCEA(6D) 
mCEA(6D,1st&2nd) 

1951 2000 
ACCTATGCCT GTTTTGTCTC TAACTTGGCT ACTGGCCGCA ATAATTCCAT 
ACCTATGCCT GTTTTGTCTC TAACTTGGCT ACTGGCCGCA ATAATTCCAT 

2001 2050 
AGTCAAGAGC ATCACAGTCT CTGCATCTGG AACTTCTCCT GGTCTCTCAG 
AGTCAAGAGC ATCACAGTCT CTGCATCTGG AACTTCTCCT GGTCTCTCAG 

2051 2100 
CTGGGGCCAC TGTCGGCATC ATGATTGGAG TGCTGGTTGG GGTTGCTCTG 
CTGGGGCCAC TGTCGGCATC ATGATTGGAG TGCTGGTTGG GGTTGCTCTG 

2101 
ATATAG 
ATATAG 
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FIGURE 3 

A. Amino Acid Sequence Comparison of "Wild-Type KSA" (1) and Modified KSA (2) 

1 MAPPQVLAFGLLLAAATATFAAAQ 
2 MAPPQVLAFGLLLAAATATFAAAQ 

1 SKLAAKCLVMKAEMNG 
2 SKLAAKCLVMKAEMNGSKLGRRAKPEGALQNNDGLYDPDCD 

1 VNTAGVRRTDKDTEITCSERVRTYWIIIELKHKAREKPYDSKSLRTALQKEITTRYQLD 
2 VNTAGVRRTDKDTEITCSERVRTYWIIIELKHKAREKPYDSKSLRTALQKEITTRYQLD 

1 PKFITSILYENNVITIDLVQNSSQKTQNDVDIADVAYYFEKDVKGESLFHSKKMDLTVN 
2 PKFITSVLYENNVITIDLVQNSSQKTQNDVDIADVAYYFEKDVKGESLFHSKKMDLTVN 

1 GEQLDLDPGQTLIYYVDEKAPEFSMQGLKA-GVIAVIVVVVIAVVAGIVVLVISRKKRMA 
2 GEQLDLDPGQTLIYYVDEKAPEFSMQGLKAGVIAVIVVVVIAVVAGIVVLVISRKKRMA 

1 KYEKAEIKEMGEMHRELNA 
2 KYEKAEIKEMGEMHRELNA 

B. DNA Sequence of Modified KSA 

atggcgcccccgcaggtcctcgcgttcgggcttctgcttgccgcggcgacggcgacttttgccgcagctcaggaa 
gaatgtgtctgtgaaaactacaagctggccgtaaactgctttgtgaataataatcgtcaatgccagtgtacttca 
gttggtgcacaaaatactgtcatttgctcaaagctggctgccaaatgtttggtgatgaaggcagaaatgaatggc 
tcaaaacttgggagaagagcaaaacctgaaggggccctccagaacaatgatgggctttatgatcctgactgcgat 
gagagcgggctctttaaggccaagcagtgcaacggcacctccacgtgctggtgtgtgaacactgctggggtcaga 
agaacagacaaggacactgaaataacctgctctgagcgagtgagaacctactggatcatcattgaactaaaacac 
aaagcaagagaaaaaccttatgatagtaaaagtttgcggactgcacttcagaaggagatcacaacgcgttatcaa 
ctggatccaaaatttatcacgagtgtgttgtatgagaataatgttatcactattgatctggttcaaaattcttct 
caaaaaactcagaatgatgtggacatagctgatgtggcttattattttgaaaaagatgttaaaggtgaatccttg 
tttcattctaagaaaatggacctgacagtaaatggggaacaactggatctggatcctggtcaaactttaatttat 
tatgttgatgaaaaagcacctgaattctcaatgcagggtctaaaagctggtgttattgctgttattgtggttgtg 
gtgatagcagttgttgctggaattgttgtgctggttatttccagaaagaagagaatggcaaagtatgagaaggct 
gagataaaggagatgggtgagatgcatagggaactcaatgcataa 
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FIGURE 4A 
Construction of Modified KSA Plasmid 
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FIGURE 4B 
Construction of Modified KSA Plasmid 




Right Arm 
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FIGURE 5 

A. Plasmid Map of Modified KSA Expression Vector 



(Plasmid map labels: H6 Promoter, KSAV, Right Arm) 

B. DNA Sequence of Modified KSA Expression Vector 



Promoter H6 for KSAV    9930-9515 
KSAV                    1-945 
Left arm                1002-1422 
Right arm               4070-5590 
Right arm fragment      9012-9299 



MetAlaProPro GlnValLeu AlaPheGly LeuLeuLeuAla AlaAlaThr- 
1 ATGGCGCCCC CGCAGGTCCT CGCGTTCGGG CTTCTGCTTG CCGCGGCGAC 
TACCGCGGGG GCGTCCAGGA GCGCAAGCCC GAAGACGAAC GGCGCCGCTG 
.AlaThrPhe AlaAlaAlaGln GluGluCys ValCysGlu AsnTyrLysLeu- 
51 GGCGACTTTT GCCGCAGCTC AGGAAGAATG TGTCTGTGAA AACTACAAGC 
CCGCTGAAAA CGGCGTCGAG TCCTTCTTAC ACAGACACTT TTGATGTTCG 
..AlaValAsn CysPheVal AsnAsnAsnArg GlnCysGln CysThrSer 
101 TGGCCGTAAA CTGCTTTGTG AATAATAATC GTCAATGCCA GTGTACTTCA 
ACCGGCATTT GACGAAACAC TTATTATTAG CAGTTACGGT CACATGAAGT 
ValGlyAlaGln AsnThrVal IleCysSer LysLeuAlaAla LysCysLeu- 
151 GTTGGTGCAC AAAATACTGT CATTTGCTCA AAGCTGGCTG CCAAATGTTT 
CAACCACGTG TTTTATGACA GTAAACGAGT TTCGACCGAC GGTTTACAAA 
.ValMetLys AlaGluMetAsn GlySerLys LeuGlyArg ArgAlaLysPro- 
201 GGTGATGAAG GCAGAAATGA ATGGCTCAAA ACTTGGGAGA AGAGCAAAAC 
CCACTACTTC CGTCTTTACT TACCGAGTTT TGAACCCTCT TCTCGTTTTG 
..GluGlyAla LeuGlnAsn AsnAspGlyLeu TyrAspPro AspCysAsp 
251 CTGAAGGGGC CCTCCAGAAC AATGATGGGC TTTATGATCC TGACTGCGAT 
GACTTCCCCG GGAGGTCTTG TTACTACCCG AAATACTAGG ACTGACGCTA 
GluSerGlyLeu PheLysAla LysGlnCys AsnGlyThrSer ThrCysTrp- 
301 GAGAGCGGGC TCTTTAAGGC CAAGCAGTGC AACGGCACCT CCACGTGCTG 
CTCTCGCCCG AGAAATTCCG GTTCGTCACG TTGCCGTGGA GGTGCACGAC 
.CysValAsn ThrAlaGlyVal ArgArgThr AspLysAsp ThrGluIleThr- 
351 GTGTGTGAAC ACTGCTGGGG TCAGAAGAAC AGACAAGGAC ACTGAAATAA 
CACACACTTG TGACGACCCC AGTCTTCTTG TCTGTTCCTG TGACTTTATT 
..CysSerGlu ArgValArg ThrTyrTrpIle IleIleGlu LeuLysHis 
401 CCTGCTCTGA GCGAGTGAGA ACCTACTGGA TCATCATTGA ACTAAAACAC 
GGACGAGACT CGCTCACTCT TGGATGACCT AGTAGTAACT TGATTTTGTG 
LysAlaArgGlu LysProTyr AspSerLys SerLeuArgThr AlaLeuGln- 
451 AAAGCAAGAG AAAAACCTTA TGATAGTAAA AGTTTGCGGA CTGCACTTCA 
TTTCGTTCTC TTTTTGGAAT ACTATCATTT TCAAACGCCT GACGTGAAGT 
.LysGluIle ThrThrArgTyr GlnLeuAsp ProLysPhe IleThrSerVal- 
501 GAAGGAGATC ACAACGCGTT ATCAACTGGA TCCAAAATTT ATCACGAGTG 
CTTCCTCTAG TGTTGCGCAA TAGTTGACCT AGGTTTTAAA TAGTGCTCAC 
..LeuTyrGlu AsnAsnVal IleThrIleAsp LeuValGln AsnSerSer 
551 TGTTGTATGA GAATAATGTT ATCACTATTG ATCTGGTTCA AAATTCTTCT 
ACAACATACT CTTATTACAA TAGTGATAAC TAGACCAAGT TTTAAGAAGA 
GlnLysThrGln AsnAspVal AspIleAla AspValAlaTyr TyrPheGlu- 
601 CAAAAAACTC AGAATGATGT GGACATAGCT GATGTGGCTT ATTATTTTGA 
GTTTTTTGAG TCTTACTACA CCTGTATCGA CTACACCGAA TAATAAAACT 
.LysAspVal LysGlyGluSer LeuPheHis SerLysLys MetAspLeuThr- 
651 AAAAGATGTT AAAGGTGAAT CCTTGTTTCA TTCTAAGAAA ATGGACCTGA 
TTTTCTACAA TTTCCACTTA GGAACAAAGT AAGATTGTTT TACCTGGACT 
..ValAsnGly GluGlnLeu AspLeuAspPro GlyGlnThr LeuIleTyr 
701 CAGTAAATGG GGAACAACTG GATCTGGATC CTGGTCAAAC TTTAATTTAT 
GTCATTTACC CCTTGTTGAC CTAGACCTAG GACCAGTTTG AAATTAAATA 
TyrValAspGlu LysAlaPro GluPheSer MetGlnGlyLeu LysAlaGly- 
751 TATGTTGATG AAAAAGCACC TGAATTCTCA ATGCAGGGTC TAAAAGCTGG 
ATACAACTAC TTTTTCGTGG ACTTAAGAGT TACGTCCCAG ATTTTCGACC 
.ValIleAla ValIleValVal ValValIle AlaValVal AlaGlyIleVal- 
801 TGTTATTGCT GTTATTGTGG TTGTGGTGAT AGCAGTTGTT GCTGGAATTG 
ACAATAACGA CAATAACACC AACACCACTA TCGTCAACAA CGACCTTAAC 
..ValLeuVal IleSerArg LysLysArgMet AlaLysTyr GluLysAla 
851 TTGTGCTGGT TATTTCCAGA AAGAAGAGAA TGGCAAAGTA TGAGAAGGCT 
AACACGACCA ATAAAGGTCT TTCTTCTCTT ACCGTTTCAT ACTCTTCCGA 
GluIleLysGlu MetGlyGlu MetHisArg GluLeuAsnAla *** 
901 GAGATAAAGG AGATGGGTGA GATGCATAGG GAACTCAATG CATAAGAAGC 
CTCTATTTCC TCTACCCACT CTACGTATCC CTTGAGTTAC GTATTCTTCG 
951 TTATCGATAC CGTCGACCTC GAGGAATTCT TTTTATTGAT TAACTAGTTA 
AATAGCTATG GCAGCTGGAG CTCCTTAAGA AAAATAACTA ATTGATCAAT 
1001 ATCACGGCCG CTTATAAAGA TCTAAAATGC ATAATTTCTA AATAATGAAA 
TAGTGCCGGC GAATATTTCT AGATTTTACG TATTAAAGAT TTATTACTTT 
1051 AAAAAGTACA TCATGAGCAA CGCGTTAGTA TATTTTACAA TGGAGATTAA 
TTTTTCATGT AGTACTCGTT GCGCAATCAT ATAAAATGTT ACCTCTAATT 
1101 CGCTCTATAC CGTTCTATGT TTATTGATTC AGATGATGTT TTAGAAAAGA 
GCGAGATATG GCAAGATACA AATAACTAAG TCTACTACAA AATCTTTTCT 
1151 AAGTTATTGA ATATGAAAAC TTTAATGAAG ATGAAGATGA CGACGATGAT 
TTCAATAACT TATACTTTTG AAATTACTTC TACTTCTACT GCTGCTACTA 
1201 TATTGTTGTA AATCTGTTTT AGATGAAGAA GATGACGGGC TAAAGTATAC 
ATAACAACAT TTAGACAAAA TGTACTTCTT CTACTGCGCG ATTTCATATG 
1251 TATGGTTACA AAGTATAAGT CTATACTAGT AATGGCGACT TGTGCAAGAA 
ATACCAATGT TTCATATTCA GATATGATGA TTACCGCTGA ACACGTTCTT 
1301 GGTATAGTAT AGTGAAAATG TTGTTAGATT ATGATTATGA AAAACCAAAT 
CCATATCATA TCACTTTTAC AACAATCTAA TACTAATACT TTTTGGTTTA 
1351 AAATCAGATC CATATCTAAA GGTATCTCCT TTGCACATAA TTTCATCTAT 
TTTAGTCTAG GTATAGATTT CCATAGAGGA AACGTGTATT AAAGTAGATA 
1401 TCCTAGTTTA GAATACCTGC AGCCAAGCTT GGCACTGGCC GTCGTTTTAC 
AGGATCAAAT CTTATGGAGG TCGGTTCGAA CCGTGACCGG CAGCAAAATG 
1451 AACGTCGTGA CTGGGAAAAC CCTGGCGTTA CCCAACTTAA TCGCCTTGCA 
TTGCAGCACT GACCCTTTTG GGACCGCAAT GGGTTGAATT AGCGGAACGT 
1501 GCACATCCCC CTTTCGCCAG CTGGCGTAAT AGCGAAGAGG CCCGCACCGA 
CGTGTAGGGG GAAAGCGGTC GACCGCATTA TCGCTTCTCC GGGCGTGGCT 
1551 TCGCGCTTCC CAACAGTTGC GCAGCCTGAA TGGCGAATGG CGCCTGATGC 
AGCGGGAAGG GTTGTCAACG CGTCGGACTT ACCGCTTACC GCGGACTACG 
1601 GGTATTTTCT CCTTACGGAT CTGTGCGGTA TTTCACACCG CATATGGTGC 






CCATAAAAGA GGAATGCGTA GACACGCCAT AAAGTGTGGC GTATACCACG 
1651 ACTCTCAGTA CAATCTGCTC TGATGCCGCA TAGTTAAGCC AGCCCCGACA 
TGAGAGTCAT GTTAGACGAG ACTACGGCGT ATCAATTCGG TCGGGGCTGT 
1701 CCCGCCAACA CCCGCTGACG CGCCCTGACG GGCTTGTCTG CTCCCGGCAT 
GGGCGGTTGT GGGCGACTGC GCGGGACTGC CCGAACAGAC GAGGGCCGTA 
1751 CCGCTTACAG ACAAGCTGTG ACCGTCTCCG GGAGCTGCAT GTGTCAGAGG 
GGCGAATGTC TGTTCGACAC TGGCAGAGGC CCTCGACGTA CACAGTCTCC 
1801 TTTTCACCGT CATCACCGAA ACGCGCGAGA CGAAAGGGCC TCGTGATACG 
AAAAGTGGCA GTAGTGGCTT TGCGCGCTCT GCTTTCCCGG AGCACTATGC 
1851 CCTATTTTTA TAGGTTAATG TCATGATAAT AATGGTTTCT TAGACGTCAG 
GGATAAAAAT ATCCAATTAC AGTACTATTA TTACCAAAGA ATCTGCAGTC 
1901 GTGGCACTTT TCGGGGAAAT GTGCGCGGAA CCCCTATTTG TTTATTTTTC 
CACCGTGAAA AGCCCCTTTA CACGCGCCTT GGGGATAAAC AAATAAAAAG 
1951 TAAATACATT CAAATATGTA TGCGCTCATG AGACAATAAC CCTGATAAAT 
ATTTATGTAA GTTTATACAT AGGCGAGTAC TCTGTTATTG GGACTATTTA 
2001 GCTTCAATAA TATTGAAAAA GGAAGAGTAT GAGTATTCAA CATTTCCGTG 
CGAAGTTATT ATAACTTTTT CCTTCTCATA CTCATAAGTT GTAAAGGCAC 
2051 TCGCCCTTAT TCCCTTTTTT GCGGCATTTT GCCTTCCTGT TTTTGCTCAC 
AGCGGGAATA AGGGAAAAAA CGCCGTAAAA CGGAAGGACA AAAACGAGTG 
2101 CCAGAAACGC TGGTGAAAGT AAAAGATGCT GAAGATCAGT TGGGTGCACG 
GGTCTTTGCG ACCACTTTCA TTTTCTACGA CTTCTAGTCA ACCCACGTGC 
2151 AGTGGGTTAC ATCGAACTGG ATCTCAACAG CGGTAAGATC CTTGAGAGTT 
TCACCCAATG TAGCTTGACC TAGAGTTGTC GCCATTCTAG GAACTCTCAA 
2201 TTCGCCCCGA AGAACGTTTT CCAATGATGA GCACTTTTAA AGTTCTGCTA 
AAGCGGGGCT TCTTGCAAAA GGTTACTACT CGTGAAAATT TCAAGACGAT 
2251 TGTGGCGCGG TATTATCCCG TATTGACGCC GGGCAAGAGC AACTCGGTCG 
ACACCGCGCC ATAATAGGGC ATAACTGCGG CCCGTTCTCG TTGAGCCAGC 
2301 CCGCATACAC TATTCTCAGA ATGACTTGGT TGAGTACTCA CCAGTCACAG 
GGCGTATGTG ATAAGAGTCT TACTGAACGA ACTCATGAGT GGTCAGTGTC 
2351 AAAAGCATCT TACGGATGGC ATGACAGTAA GAGAATTATG CAGTGCTGCC 
TTTTCGTAGA ATGCCTACCG TACTGTCATT CTCTTAATAC GTCACGACGG 
2401 ATAACCATGA GTGATAACAC TGCGGCCAAC TTACTTCTGA CAACGATCGG 
TATTGGTACT CACTATTGTG ACGCCGGTTG AATGAAGACT GTTGCTAGCC 
2451 AGGACGGAAG GAGCTAACCG CTTTTTTGCA CAACATGGGG GATCATGTAA 
TCCTGGCTTC CTCGATTGGC GAAAAAACGT GTTGTACCCC CTAGTACATT 
2501 GTCGCCTTGA TCGTTGGGAA CCGGAGCTGA ATGAAGCCAT ACCAAACGAC 
GAGCGGAACT AGCAACCCTT GGCCTCGACT TACTTCGGTA TGGTTTGCTG 
2551 GAGCGTGACA CCACGATGCC TGTAGCAATG GCAACAACGT TGCGCAAACT 
CTCGCACTGT GGTGCTACGG ACATCGTTAC CGTTGTTGCA ACGCGTTTGA 
2601 ATTAACTGGC GAACTACTTA CTCTAGCTTC CCGGCAACAA TTAATAGACT 
TAATTGACCG CTTGATGAAT GAGATCGAAG GGCCGTTGTT AATTATCTGA 
2651 GGATGGAGGC GGATAAAGTT GCAGGACCAC TTCTGCGCTC GGCCCTTCCG 
CCTACCTCCG CCTATTTCAA CGTCCTGGTG AAGACGCGAG CCGGGAAGGC 
2701 GCTGGCTGGT TTATTGCTGA TAAATCTGGA GCCGGTGAGC GTGGGTCTCG 
CGACCGACCA AATAACGACT ATTTAGACCT CGGCCACTCG CACCCAGAGC 
2751 GGGTATCATT GCAGCACTGG GGCCAGATGG TAAGCCCTCC CGTATCGTAG 
GCCATAGTAA CGTCGTGACC CCGGTCTACC ATTCGGGAGG GCATAGCATC 
2801 TTATCTACAC GACGGGGAGT CAGGCAACTA TGGATGAACG AAATAGACAG 
AATAGATGTG CTGCCCCTCA GTCCGTTGAT ACCTACTTGC TTTATCTGTC 
2851 ATCGCTGAGA TAGGTGCCTC ACTGATTAAG CATTGGTAAC TGTCAGACCA 
TAGCGACTCT ATCCAGGGAG TGACTAATTC GTAACCATTG ACAGTCTGGT 
2901 AGTTTACTCA TATATACTTT AGATTGATTT AAAACTTCAT TTTTAATTTA 
TCAAATGAGT ATATATGAAA TCTAACTAAA TTTTGAAGTA AAAATTAAAT 
2951 AAAGGATCTA GGTGAAGATC CTTTTTGATA ATCTCATGAC CAAAATCCCT 
TTTCCTAGAT CCACTTCTAG GAAAAACTAT TAGAGTAGTG GTTTTAGGGA 
3001 TAACGTGAGT TTTCGTTCCA CTGAGCGTCA GACCCCGTAG AAAAGATCAA 
ATTGCACTCA AAAGCAAGGT GACTCGCAGT CTGGGGCATC TTTTCTAGTT 
3051 AGGATCTTCT TGAGATCCTT TTTTTCTGCG CGTAATCTGC TGCTTGCAAA 
TCCTAGAAGA ACTCTAGGAA AAAAAGACGC GCATTAGACG ACGAACGTTT 
3101 CAAAAAAACC ACCGCTACCA GCGGTGGTTT GTTTGCCGGA TCAAGAGCTA 
GTTTTTTTGG TGGCGATGGT CGCCACCAAA CAAACGGCCT AGTTCTCGAT 
3151 CCAACTCTTT TTCCGAAGGT AACTGGCTTC AGCAGAGCGC AGATACCAAA 
GGTTGAGAAA AAGGCTTCCA TTGACCGAAG TCGTCTCGCG TCTATGGTTT 
3201 TACTGTCCTT CTAGTGTAGC CGTAGTTAGG CCACCACTTC AAGAACTCTG 
ATGACAGGAA GATCACATCG GCATCAATCC GGTGGTGAAG TTCTTGAGAC 
3251 TAGCACCGCC TACATACCTC GCTGTGCTAA TCCTGTTACC AGTGGCTGCT 
ATCGTGGGGG ATGTATGGAG CGAGACGATT AGGACAATGG TCACCGACGA 
3301 GCCAGTGGCG ATAAGTCGTG TGTTAGCGGG TTGGACTCAA GACGATAGTT 
CGGTCACCGC TATTCAGCAC AGAATGGGCC AAGCTGAGTT CTGCTATCAA 
3351 ACCGGATAAG GCGCAGCGGT CGGGCTGAAC GGGGGGTTCG TGCACACAGC 
TGGCCTATTC CGCGTCGCCA GCCCGACTTG CCCCCCAAGC ACGTGTGTCG 
3401 CCAGCTTGGA GCGAACGACC TACACCGAAC TGAGATACCT ACAGCGTGAG 
GGTCGAACCT CGCTTGCTGG ATGTGGCTTG ACTCTATGGA TGTCGCACTC 
3451 CTATGAGAAA GCGCCACGCT TCCCGAAGGG AGAAAGGCGG ACAGGTATCC 
GATACTCTTT CGCGGTGCGA AGGGCTTCCC TCTTTCCGCC TGTCCATAGG 
3501 GGTAAGCGGC AGGGTCGGAA CAGGAGAGCG CACGAGGGAG CTTCCAGGGG 
CCATTCGCCG TCCCAGCCTT GTCCTCTCGC GTGCTCCCTC GAAGGTCCCC 
3551 GAAACGCCTG GTATCTTTAT AGTCCTGTCG GGTTTCGCCA CCTCTGACTT 
CTTTGCGGAC CATAGAAATA TCAGGACAGC CCAAAGCGGT GGAGACTGAA 
3601 GAGCGTCGAT TTTTGTGATG CTCGTCAGGG GGGCGGAGCC TATGGAAAAA 
CTCGCAGCTA AAAACACTAC GAGCAGTCCC CCCGCCTCGG ATACCTTTTT 
3651 CGCCAGCAAC GCGGCCTTTT TACGGTTCCT GGCCTTTTGC TGGCCTTTTG 
GCGGTCGTTG CGCCGGAAAA ATGCCAAGGA CCGGAAAACG ACCGGAAAAC 
3701 CTCACATGTT CTTTCCTGCG TTATCCCCTG ATTCTGTGGA TAACCGTATT 
GAGTGTACAA GAAAGGACGC AATAGGGGAC TAAGACACCT ATTGGCATAA 
3751 ACCGCCTTTG AGTGAGCTGA TACCGCTCGC CGCAGGCGAA CGACCGAGCG 
TGGCGGAAAC TCACTCGACT ATGGCGAGCG GCGTCGGCTT GCTGGCTCGC 
3801 CAGCGAGTCA GTGAGCGAGG AAGCGGAAGA GCGCCCAATA CGCAAACCGC 
GTCGCTCAGT CACTCGCTCC TTCGCCTTCT CGCGGGTTAT GCGTTTGGCG 
3851 CTCTCCCCGC GCGTTGGCCG ATTCATTAAT GCAGCTGGCA CGACAGGTTT 
GAGAGGGGCG CGCAACCGGC TAAGTAATTA CGTCGACCGT GCTGTCCAAA 
3901 CCCGACTGGA AAGCGGGCAG TGAGCGCAAC GCAATTAATG TGAGTTAGCT 
GGGCTGACCT TTCGCCCGTC ACTCGCGTTG CGTTAATTAC ACTCAATCGA 
3951 CACTCATTAG GCACCCCAGG CTTTACACTT TATGCTTCCG GCTCGTATGT 
GTGAGTAATC CGTGGGGTCC GAAATGTGAA ATACGAAGGC CGAGCATACA 
4001 TGTGTGGAAT TGTGAGCGGA TAACAATTTC ACACAGGAAA CAGCTATGAC 
ACACACCTTA ACACTCGCCT ATTGTTAAAG TGTGTCCTTT GTCGATACTG 
4051 CATGATTACG AATTGAATTG CGGCCGCAAT TCTGAATGTT AAATGTTATA 
GTACTAATGC TTAACTTAAC GCCGGCGTTA AGACTTACAA TTTACAATAT 
4101 CTTTGGATGA AGCTATAAAT ATGCATTGGA AAAATAATCC ATTTAAAGAA 
GAAAGCTACT TCGATATTTA TACGTAACCT TTTTATTAGG TAAATTTCTT 
4151 AGGATTCAAA TACTACAAAA CCTAAGCGAT AATATGTTAA CTAAGCTTAT 
TCCTAAGTTT ATGATGTTTT GGATTCGCTA TTATACAATT GATTCGAATA 
4201 TCTTAACGAC GCTTTAAATA TACACAAATA AACATAATTT TTGTATAACC 
AGAATTGCTG CGAAATTTAT ATGTGTTTAT TTGTATTAAA AACATATTGG 
4251 TAACAAATAA CTAAAACATA AAAATAATAA AAGGAAATGT AATATCGTAA 
ATTGTTTATT GATTTTGTAT TTTTATTATT TTCCTTTACA TTATAGCATT 
4301 TTATTTTACT CAGGAATGGG GTTAAATATT TATATCACGT GTATATCTAT 
AATAAAATGA GTGCTTACCC CAATTTATAA ATATAGTGCA CATATAGATA 
4351 ACTGTTATCG TATACTCTTT ACAATTACTA TTACGAATAT GCAAGAGATA 
TGACAATAGC ATATGAGAAA TGTTAATGAT AATGCTTATA CGTTCTCTAT 
4401 ATAAGATTAC GTATTTAAGA GAATCTTGTC ATGATAATTG GGTACGACAT 
TATTCTAATG CATAAATTCT CTTAGAACAG TACTATTAAC CCATGCTGTA 
4451 AGTGATAAAT GGTATTTCGG ATCGTTACAT AAAGTCAGTT GGAAAGATGG 
TCACTATTTA CGATAAAGCG TAGCAATGTA TTTCAGTCAA CCTTTCTACC 
4501 ATTTGACAGA TGTAACTTAA TAGGTGCAAA AATGTTAAAT AACAGCATTC 
TAAACTGTCT ACATTGAATT ATCCACGTTT TTACAATTTA TTGTCGTAAG 
4551 TATCGGAAGA TAGGATACCA GTTATATTAT ACAAAAATCA CTGGTTGGAT 
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GGAAACGCTT 
AATACTGGCA 
TTATGACCGT 
TGGGACTGGG 
ACCCTGACCC 
GTGGTCGGCT 
CACCAGCCGA 
TCTGTATGAA 
AGACATACTT 
ACGGAAGCAA 
TGCCTTCGTT 
AACCATCGAA 
TTGGTAGCTT 
TCCTGCACTG 
AGGACGTGAC 
GTGCCTCTGG 
CACGGAGACC 
ACTACCGCAG 
TGATGGCGTC 
TGCAACCGAA 
ACGTTGGCTT 
CAGCAGTGGC 
GTCGTCACCG 
CCACGCCATC 
GGTGCGGTAG 
TGGGTAATAA 
ACCCATTATT 
ATGTGGATTG 
TACACCTAAC 
CACCCGTGCA 
GTGGGCACGT 
TTGACCCTAA 
AACTGGGATT 
GCCGAAGCAG 
CGGCTTCGTC 
GCTGATTACG 
CGACTAATGC 
TCAGCCGGAA 
AGTCGGCCTT 
GTTGATGTTG 
CAACTACAAC 
GAACTGCCAG 
CTTGACGGTC 
GGCCGCAAGA 
CCGGCGTTCT 
TGGGATCTGC 
ACCCTAGACG 
AAACGGTCTG 
TTTGCCAGAC 
GGCGCGGCGA 
CCGCGCGGCT 
ATGGAAACCA 
TACCTTTGGT 
GAATATCGAC 
CTTATAGCTG 
CGTCAGTATC 
GCAGTGATAG 
TTGGTCTGGT 
AACCAGACCA 



ATGCGGGTGC 
GGCGTTTCGT 
CCGCAAAGCA 
TGGATCAGTC 
ACCTAGTCAG 
TACGGCGGTG 
ATGCCGCCAC 
CGGTCTGGTC 
GCCAGACCAG 
AACACCAGCA 
TTGTGGTCGT 
GTGACCAGCG 
CACTGGTCGC 
GATGGTGGCG 
CTACCACCGC 
ATGTCGCTCC 
TACAGCGAGG 
CCGGAGAGCG 
GGCCTCTCGC 
CGCGACCGCA 
GCGCTGGCGT 
GTCTGGCGGA 
CAGACCGCCT 
CCGCATCTGA 
GGCGTAGACT 
GCGTTGGCAA 
CGCAACCGTT 
GCGATAAAAA 
CGCTATTTTT 
CCGGTGGATA 
GGCGACCTAT 
CGCCTGGGTC 
GCGGACCCAG 
CGTTGTTGCA 
GCAACAACGT 
ACGGCTCACG 
TGGCGAGTGC 
AACCTACCGG 
TTGGATGGCC 
AAGTGGCGAG 
TTCACCGCTC 
CTGGCGCAGG 
GACCGCGTCC 
AAACTATCCC 
TTTGATAGGG 
CATTGTCAGA 
GTAACAGTCT 
CGCTGCGGGA 
GCGACGCCCT 
CTTCCAGTTC 
GAAGGTCAAG 
GCCATCGCCA 
CGGTAGCGGT 
GGTTTCCATA 
CCAAAGGTAT 
GGCGGAATTC 
CCGCCTTAAG 
GTCAAAAATA 
CAGTTTTTAT 



GCTACCCATT 
CAGTATCCCC 
GTCATAGGGG 
GCTGATTAAA 
CGACTAATTT 
ATTTTGGCGA 
TAAAACCGCT 
TTTGCCGACC 
AAACGGCTGG 
GCAGTTTTTC 
CGTCAAAAAG 
AATACCTGTT 
TTATGGACAA 
CTGGATGGTA 
GACCTACCAT 
ACAAGGTAAA 
TGTTCCATTT 
CCGGGCAACT 
GGCCCGTTGA 
TGGTCAGAAG 
ACCAGTCTTC 
AAACCTCAGT 
TTTGGAGTCA 
CCACCAGCGA 
GGTGGTCGCT 
TTTAACCGCC 
AAATTGGCGG 
ACAACTGCTG 
TGTTGACGAC 
ACGACATTGG 
TGCTGTAACC 
GAACGCTGGA 
CTTGCGACCT 
GTGCACGGCA 
CACGTGCCGT 
CGTGGCAGCA 
GCACCGTCGT 
ATTGATGGTA 
TAACTACCAT 
CGATACACCG 
GCTATGTGGC 
TAGCAGAGCG 
ATCGTCTCGC 
GACCGCCTTA 
CTGGCGGAAT 
CATGTATACC 
GTACATATGG 
CGCGCGAATT 
GCGCGCTTAA 
AACATCAGCC 
TTGTAGTCGG 
TCTGCTGCAC 
AGACGACGTG 
TGGGGATTGG 
ACCCCTAACC 
CAGCTGAGCG 
GTCGACTCGC 
ATAATAACCG 
TATTATTGGC 



GTCAGAACCG 
GTTTACAGGG 
CAAATGTCCC 
TATGATGAAA 
ATACTACTTT 
TACGCCGAAC 
ATGCGGCTTG 
GCACGCCGCA 
CGTGCGGCGT 
CAGTTCCGTT 
GTCAAGGCAA 
CCGTCATAGC 
GGCAGTATCG 
AGCCGCTGGC 
TCGGCGACCG 
CAGTTGATTG 
GTCAACTAAC 
CTGGCTCACA 
GACCGAGTGT 
CCGGGCACAT 
GGGCCGTGTA 
GTGAGGCTCC 
CACTGCGAGG 
AATGGATTTT 
TTACCTAAAA 
AGTCAGGCTT 
TCAGTCCGAA 
ACGCCGCTGC 
TGCGGCGACG 
CGTAAGTGAA 
GCATTCACTT 
AGGCGGCGGG 
TCCGCCGCCC 
GATACACTTG 
CTATGTGAAC 
TCAGGGGAAA 
AGTCCCCTTT 
GTGGTCAAAT 
CACCAGTTTA 
CATCCGGCGC 
GTAGGCCGCG 
GGTAAACTGG 
CCATTTGACC 
CTGCCGCCTG 
GACGGCGGAC 
CCGTACGTCT 
GGCATGCAGA 
GAATTATGGC 
CTTAATACGG 
GCTACAGTCA 
CGATGTCAGT 
GCGGAAGAAG 
CGCCTTCTTC 
TGGCGACGAC 
ACCGCTGCTG 
CCGGTCGCTA 
GGCCAGCGAT 
GGCAGGGGGG 
CCGTCCCCCC 



CCAAAGCGAT 
CGGCTTCGTC 
GCCGAAGCAG 
ACGGCAACCC 
TGCCGTTGGG 
GATCGCCAGT 
CTAGCGGTCA 
TCCAGCGCTG 
AGGTCGCGAC 
TATCCGGGCA 
ATAGGCCCGT 
GATAACGAGC 
CTATTGCTCG 
AAGCGGTGAA 
TTCGCCACTT 
AACTGCCTGA 
TTGACGGACT 
GTACGCGTAG 
CATGCGCATC 
CAGCGCCTGG 
GTCGCGGACC 
CCGCCGCGTC 
GGCGGCGCAG 
TGCATCGAGC 
ACGTAGCTCG 
TCTTTCACAG 
AGAAAGTGTC 
GCGATCAGTT 
CGCTAGTCAA 
GCGACCCGCA 
CGCTGGGCGT 
CCATTACCAG 
GGTAATGGTC 
CTGATGCGGT 
GACTACGCCA 
ACCTTATTTA 
TGGAATAAAT 
GGCGATTACC 
CCGCTAATGG 
GGATTGGCCT 
CCTAACCGGA 
CTCGGATTAG 
GAGCCTAATC 
TTTTGACCGC 
AAAACTGGCG 
TCCCGAGCGA 
AGGGCTCGCT 
CCACACCAGT 
GGTGTGGTCA 
ACAGCAACTG 
TGTCGTTGAC 
GCACATGGCT 
CGTGTACCGA 
TCCTGGAGCC 
AGGACCTCGG 
CCATTACCAG 
GGTAATGGTC 
ATCCGGAGCT 
TAGGCCTCGA 
9001 TATCGCAGAT CAATGATCGC TGTACAATCT GGAAATATTG AAATATGTAG 
ATAGCGTCTA GTTACTAGCG ACATGTTAGA CCTTTATAAC TTTATACATC 
9051 CACACTACTT AAAAAAAATA AAATGTCCAG AACTGGGAAA AATTGATCTT 
GTGTGATGAA TTTTTTTTAT TTTACAGGTC TTGACCCTTT TTAACTAGAA 
9101 GCCAGCTGTA ATTCATGGTA GAAAAGAAGT GCTCAGGCTA CTTTTCAACA 
CGGTCGACAT TAAGTACCAT CTTTTCTTCA CGAGTCCGAT GAAAAGTTGT 
9151 AAGGAGCAGA TGTAAACTAC ATCTTTGAAA GAAATGGAAA ATCATATACT 
TTCCTCGTCT ACATTTGATG TAGAAACTTT CTTTACCTTT TAGTATATGA 
9201 GTTTTGGAAT TGATTAAAGA AAGTTACTCT GAGACACAAA AGAGGTAGCT 
CAAAACCTTA ACTAATTTCT TTCAATGAGA CTCTGTGTTT TCTCCATCGA 
9251 GAAGTGGTAC TCTCAAAGGT ACGTGACTAA TTAGCTATAA AAAGGATCCG 
CTTCACCATG AGAGTTTCCA TGCACTGATT AATCGATATT TTTCCTAGGC 
9301 GTACCCTCGA GTCTAGAATC GATCCCGGGT TAATTAATTA GTTATTAGAC 
CATGGGAGCT CAGATCTTAG CTAGGGCCCA ATTAATTAAT CAATAATCTG 
9351 AAGGTGAAAA CGAAACTATT TGTAGCTTAA TTAATTAGAG CTTCTTTATT 
TTCCACTTTT GCTTTGATAA ACATCGAATT AATTAATCTC GAAGAAATAA 
9401 CTATACTTAA AAAGTGAAAA TAAATACAAA GGTTCTTGAG GGTTGTGTTA 
GATATGAATT TTTCACTTTT ATTTATGTTT CCAAGAACTC CCAACACAAT 
9451 AATTGAAAGC GAGAAATAAT CATAAATTAT TTCATTATCG CGATATCCGT 
TTAACTTTCG CTCTTTATTA GTATTTAATA AAGTAATAGC GCTATAGGCA 
9501 TAAGTTTGTA TCGTA 
ATTCAAACAT AGCAT 
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FIGURE 6 
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